Instar Analytics BIG

Instar Analytics BIG is the distributed-calculation version of Instar Analytics. It processes requests faster and can handle a wider range of samples, potentially up to a few million households.

The key factor is the parallelization of template calculations: the sample data is split into smaller pieces, the pieces are processed in parallel, and the partial results are aggregated at the end.
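
The split-and-aggregate idea described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the household fields (`weight`, `minutes`), the function names and the use of a local thread pool in place of real calculation nodes are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split(households, n):
    """Split the sample into n pieces (simple striding, hypothetical)."""
    return [households[i::n] for i in range(n)]


def partial_result(piece):
    """Per-piece calculation: weighted minutes and total weight."""
    minutes = sum(h["weight"] * h["minutes"] for h in piece)
    weight = sum(h["weight"] for h in piece)
    return minutes, weight


def aggregate(partials):
    """Combine the partial sums into one weighted average."""
    total_minutes = sum(m for m, _ in partials)
    total_weight = sum(w for _, w in partials)
    return total_minutes / total_weight


def run_parallel(households, n):
    """Process every piece concurrently, then aggregate at the end."""
    pieces = split(households, n)
    # The thread pool stands in for the calculation nodes (assumption).
    with ThreadPoolExecutor(max_workers=n) as pool:
        partials = list(pool.map(partial_result, pieces))
    return aggregate(partials)


# Synthetic sample, for illustration only.
sample = [{"weight": 1.0 + (i % 3), "minutes": 10 * (i % 7)} for i in range(1000)]
average_minutes = run_parallel(sample, 4)
```

Each piece returns sums rather than an average, so the aggregated figure matches what a single pass over the complete sample would give. Averaging per-piece averages would not be equivalent in general.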

A BIG configuration provides scalability, reliability and load balancing, all transparently to the end user.

The user interfaces are identical to those of the standard version, and no additional skills are needed to use the BIG version.

The households in the dataset are split into N partitions by an algorithm that tries to distribute them as efficiently as possible. The partitions have an identical or very similar number of households.
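
The product's actual partitioning algorithm is not described here. As a hedged illustration of how N partitions of near-identical size can be obtained, the sketch below deals household identifiers out round-robin. The function name and the use of plain identifiers are assumptions.

```python
def partition_households(household_ids, n):
    """Deal households into n partitions whose sizes differ by at most one.

    Round-robin assignment is an illustrative stand-in, not the
    algorithm used by the product.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    partitions = [[] for _ in range(n)]
    # Sorting first makes the assignment deterministic across runs.
    for index, household_id in enumerate(sorted(household_ids)):
        partitions[index % n].append(household_id)
    return partitions


parts = partition_households(range(10), 3)
sizes = [len(p) for p in parts]
```

With 10 households and N = 3, the partition sizes are 4, 3 and 3. A real partitioner might also balance household characteristics, not only counts.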

Each partition is served by one or more calculation nodes, and the Job distributor balances the workload among the available slave replicas for the best performance. If more concurrent users must be supported, more calculation nodes can be launched to cover the extra workload (permanently or only during high-demand peaks).

If a calculation node fails, its replicas take over its calculations, providing reliability transparently to the end user.
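
The load balancing and failover behaviour described above can be sketched as follows. This is a simplified model under stated assumptions, not the Job distributor's real logic: the `Replica` and `JobDistributor` classes, the least-loaded selection rule and the use of `ConnectionError` to signal a failed node are all hypothetical.

```python
class Replica:
    """A calculation node serving one partition (hypothetical model)."""

    def __init__(self, name, partition):
        self.name = name
        self.partition = partition
        self.alive = True
        self.active_jobs = 0


class JobDistributor:
    """Sends each job to the least-loaded live replica of a partition."""

    def __init__(self, replicas):
        self.replicas = replicas

    def pick(self, partition):
        candidates = [
            r for r in self.replicas if r.partition == partition and r.alive
        ]
        if not candidates:
            raise RuntimeError(f"no live replica for partition {partition}")
        return min(candidates, key=lambda r: r.active_jobs)

    def dispatch(self, partition, job):
        """Run job on a replica, retrying on another one if a node fails."""
        while True:
            replica = self.pick(partition)
            replica.active_jobs += 1
            try:
                return replica.name, job(replica)
            except ConnectionError:
                # Mark the node as failed; the loop retries on a sibling.
                replica.alive = False
            finally:
                replica.active_jobs -= 1


def sample_job(replica):
    """Illustrative job: node "a0" is simulated as being down."""
    if replica.name == "a0":
        raise ConnectionError("node down")
    return f"partition {replica.partition} done"


distributor = JobDistributor([Replica("a0", 0), Replica("b0", 0)])
served_by, result = distributor.dispatch(0, sample_job)
```

In this run the first attempt goes to `a0`, which fails, and the job is completed by its replica `b0` without the caller having to retry.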

Performance & Scalability

Splitting the audience data into N subsets can drastically improve performance compared with a classic Instar Desktop installation working on the complete dataset.

For optimal performance, avoid placing too many slaves on a single server and use SSD disks (to avoid disk-access bottlenecks). A good LAN connection between master and slave nodes is also recommended.

As N (the number of sample subsets) grows, performance improves, but hardware needs increase proportionally, resulting in a higher cost. It is important to choose a value that balances cost and performance. N can be changed after the system has gone live, but the historical data must then be re-partitioned.

Cloud-Ready Application

Instar Analytics BIG is designed to be hosted on centralized servers, either on-premises or on commercial cloud-computing platforms such as Amazon Web Services, Microsoft Azure, Alibaba's Aliyun or Google Cloud.

As a scalable product, Instar BIG takes advantage of the flexibility and scalability of these cloud platforms.

Flexibility

Instar Analytics BIG is fully compatible with our other products:

It can be accessed through Instar Analytics IEA to retrieve data from Instar for use in third-party applications or custom-developed applications, or for merging with data from other sources.

It can work as a backend for Instar Analytics Web.

Requirements

Master and Slave nodes have the same software requirements as a standard Instar Desktop installation.

Infrastructure: several servers to host the master and slaves, ideally one per slave plus one per master. The actual number depends on the trade-off between cost and performance and on the number of users who will access the product.

Recommended: SSD disks and a good network connection.